We present quantum protocols for executing arbitrarily accurate $\pi/2^k$ rotations of a qubit about
its Z axis. Reduced instruction set computing (RISC) architectures typically restrict the instruction
set to stabilizer operations and a single non-stabilizer operation, such as preparation of a "magic"
state from which $T = Z(\pi/4)$ gates can be teleported. Although the overhead required to distill
high-fidelity copies of this magic state is high, the subsequent quantum compiling overhead to realize
Z rotations in a RISC architecture can be much greater. We develop a complex instruction set
computing (CISC) architecture whose instruction set includes stabilizer operations and preparation
of magic states from which $Z(\pi/2^k)$ gates can be teleported, for $2 \le k \le k_{\max}$. This results in a
substantial overall reduction in the number of gates required to achieve a desired gate accuracy for
Z rotations. The key to our construction is a family of shortened quantum Reed-Muller codes of
length $2^{k+2} - 1$, whose magic-state distillation threshold shrinks with $k$ but is greater than 0.85%
for $k \le 6$.

PACS numbers: 03.67.Lx 



I. INTRODUCTION 

One of the biggest challenges in quantum information 
science is that quantum information is incredibly frag- 
ile. Even with great experimental care, decoherence can 
quickly corrupt key features such as superposition and 
entanglement. To circumvent the ravages of decoher- 
ence, one can consider alternative models of quantum 
computation, such as adiabatic quantum computation 
[1-3], which may offer direct physical immunity to certain
classes of noise [4-14]. Another approach is to encode 
quantum information redundantly in an error-correcting 
code and process it fault-tolerantly to suppress the catas- 
trophic propagation of errors [15, 16]. Somewhat mirac- 
ulously, this latter approach works, and works arbitrarily 
well, when quantum computations are expressed as quan- 
tum circuits in which each elementary operation has a 
failure probability below a value known as the accuracy 
threshold [17-23]. Estimates for the accuracy threshold 
vary, and depend in part on the specifics of the fault- 
tolerant quantum computing protocol used. One of the 
more favorable estimates is $\approx 1\%$ for a protocol based on
Kitaev's surface codes [24-27]. An outstanding grand 
challenge in quantum information science is finding a 
way to marry fault-tolerance methods with intrinsically 
robust computational models to achieve fault tolerance 
with more achievable resource requirements [28-31]. 

One of the factors driving up the resource requirements 
in fault-tolerant quantum computing is the need to restrict
the set of elementary operations in the "primitive"
or "physical" instruction set to be finite. This is necessary
because these instructions are presumed to be implementable
only up to some maximal accuracy. One of the
main jobs of a fault-tolerant quantum computing proto- 
col is to define how one should sequence these primitive 
instructions together to synthesize arbitrarily accurate 
versions of each element of a universal "encoded" or "log- 
ical" instruction set, even when the primitive instructions 
themselves are faulty. Then, using these logical instruc- 
tions, one can realize any quantum algorithm arbitrarily 
reliably, even in the face of decoherence and other sources 
of noise. 

In a typical fault-tolerant quantum computing proto- 
col, some logical instructions are "easy" to synthesize 
in that their error is solely a function of the errors in 
the primitive instructions from which they are composed. 
The accuracy of these logical instructions can be improved
arbitrarily well by using arbitrarily good quantum codes.
More quantitatively, the number of gates and qubits required
to achieve approximation error $\epsilon$ for the "easy"
instructions scales as $O(\log^{\alpha}(1/\epsilon))$, where $\alpha$
depends on the protocol, predominantly on the quantum
code and classical decoding algorithm it uses. Standard
techniques for realizing such gates include transversal
action [22, 23] and code deformation [25, 26]. 2D topological
codes using most-likely-error decoding can achieve
$\alpha = 3$ [25, 26]; Pippenger has conjectured that it should
be possible to lower $\alpha$ all the way to 1 [32].

Most protocols also have a set of logical instructions
that are "hard" to synthesize, requiring additional methods
and resources. The Eastin-Knill theorem, for example,
guarantees that no protocol can realize a universal
logical instruction set by transversal action alone
[33]. A typical approach to synthesizing these hard logical
instructions is to use the "magic state" approach, in
which the "hard" instructions are state preparations that
are distilled to high fidelity using the "easy" operations
[34]. The number of ideal gates and qubits required to
achieve approximation error $\epsilon$ in this approach scales as
$O(\log^{\beta}(1/\epsilon))$, where $\beta$ depends on the magic-state
distillation protocol. When the resource costs for the
"easy" gates are also considered, the combined overhead
scales as $O(\log^{\alpha+\beta}(1/\epsilon))$. In the original Bravyi-Kitaev
15-to-1 distillation protocol [34], $\beta = \log_3 15 \approx 2.46$.
More recent constructions by Bravyi and Haah [35] and
by Jones [36] achieve $\beta = \log_2 3 \approx 1.58$. Bravyi and Haah
conjecture that it should be possible to lower $\beta$ all the
way to 1 [35].

As an aside, it is worth mentioning that fault-tolerant 
quantum computing protocols based on some quantum 
codes have no "hard" logical instructions at all. For ex- 
ample, the 3D (and higher-dimensional) topological color 
codes have this feature [37, 38]. They cleverly circumvent
the Eastin-Knill theorem by making (non-transversal!)
quantum error correction be the process by which magic
states are prepared. A challenge to using these codes in
practice is that implementing them without relying on
long-distance quantum communication requires 3D spatial
geometry, but many quantum technologies are naturally
restricted to 1D or 2D. Even more challenging is
that the only explicit 3D color code of which we are aware 
is the 15-qubit shortened quantum Reed-Muller code [37]. 

Because of the additional overhead incurred in synthe- 
sizing "hard" logical instructions, research to date has 
focused on what one might term reduced instruction set 
computing, or RISC architectures in which only a single 
"hard" logical instruction is added to an otherwise "easy" 
logical instruction set. However, in the big picture, the 
logical instructions are intended to be used to execute 
quantum algorithms, and constraining oneself to a RISC 
architecture is not necessarily the wisest choice for reducing
the overall qubit and gate overhead. In order to
compile the logical instructions into a sequence that
approximates a quantum computation with error at most
$\epsilon$, one must use $O(\log^{\gamma}(1/\epsilon))$ gates, where $\gamma$ depends on
the quantum compiling algorithm used. The overall cost
of fault-tolerantly implementing a quantum computation
is then $O(\log^{\alpha+\beta+\gamma}(1/\epsilon))$. By increasing the size of the
instruction set so that one has a complex instruction set
computing, or CISC, architecture, one can optimize both
$\beta$ and $\gamma$ together rather than separately. When quantum
compiling is optimized independently, $\gamma$ can be no
lower than 1 [39], a value recently achieved by an explicit
Diophantine-equation-based algorithm by Kliuchnikov
et al. [40]. For comparison's sake, the more well-studied
Dawson-Nielsen variant of the Solovay-Kitaev algorithm
achieves $\gamma = \log 5/\log(3/2) \approx 3.97$ [41].
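The exponents quoted above follow directly from their closed forms. The short sketch below recomputes them and adds them up for two illustrative combinations; the constant names and the helper `total_exponent` are ours, introduced only for this illustration, and $\alpha = 3$ is simply the 2D topological-code value cited in the text.

```python
import math

# Exponents quoted in the text, recomputed from their closed forms.
BETA_15_TO_1 = math.log(15) / math.log(3)            # Bravyi-Kitaev 15-to-1, Ref. [34]
BETA_BRAVYI_HAAH = math.log2(3)                      # Refs. [35, 36]
GAMMA_DAWSON_NIELSEN = math.log(5) / math.log(1.5)   # Ref. [41]
GAMMA_LOWER_BOUND = 1.0                              # Refs. [39, 40]
ALPHA_2D_TOPOLOGICAL = 3.0                           # Refs. [25, 26]


def total_exponent(alpha, beta, gamma):
    """Exponent p in the overall cost O(log^p(1/eps)), with p = alpha + beta + gamma."""
    return alpha + beta + gamma


if __name__ == "__main__":
    older = total_exponent(ALPHA_2D_TOPOLOGICAL, BETA_15_TO_1, GAMMA_DAWSON_NIELSEN)
    newer = total_exponent(ALPHA_2D_TOPOLOGICAL, BETA_BRAVYI_HAAH, GAMMA_LOWER_BOUND)
    print(f"beta  (15-to-1)        = {BETA_15_TO_1:.3f}")
    print(f"beta  (Bravyi-Haah)    = {BETA_BRAVYI_HAAH:.3f}")
    print(f"gamma (Dawson-Nielsen) = {GAMMA_DAWSON_NIELSEN:.3f}")
    print(f"alpha + beta + gamma: {older:.2f} versus {newer:.2f}")
```

Because the three exponents add, improving any one step in isolation lowers the total by only that step's share, which is the motivation for optimizing $\beta$ and $\gamma$ jointly.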

To compare and contrast the RISC and CISC approaches 
more concretely without being encumbered by details 
of quantum error correcting codes and fault tolerance 
(which only contribute to a and a delineation of which 
logical instructions are "easy" or "hard" — properties 
shared by both approaches), we abstract these details 
away and simply consider the straightforward problem 
of how to approximate $\pi/2^k$ rotations of a qubit about
its Z axis with a desired error at most $\epsilon'$ when we are
given the ability to perform a prescribed set of "easy"
instructions that are error-free and a prescribed set of
"hard" instructions that have error at most $\epsilon > \epsilon'$. In
this setting, it is clear that some kind of distillation of
the hard instructions will be necessary to synthesize the
Z rotations with lower error. $Z(\pi/2^k)$ rotations are a
natural candidate transformation to use to compare RISC
and CISC approaches, because they arise in many quantum
algorithms, for example those that make use of the
quantum Fourier transform [42].

In Sec. II, we formulate the statement of the problem 
we are considering more precisely. In Sec. III, we review
the standard RISC solution to this problem. In Sec. IV, we 
describe our CISC solution, and compare it to the RISC 
solution, demonstrating that our solution offers a sub- 
stantial reduction in the number of gates used to achieve 
this task. Sec. V concludes. Appendix A elaborates the 
shortened quantum Reed-Muller codes we use to effect 
our protocol, and Appendix B formulates a testable set 
of criteria one can use to check if a code admits $Z(\pi/2^k)$
transversally. 

II. PROBLEM STATEMENT 

Consider quantum Z rotations of the form

\[ Z_k := Z(\pi/2^k) \tag{1} \]

for integers $k \ge 0$. As a shorthand, we use $Z$ to denote
the Pauli operator $Z_0$ and $S$ and $T$ to denote the
rotations $Z_1$ and $Z_2$ respectively. We are interested in
the scenario in which the $Z_k$ gates are not available
directly, but rather their action on $|+\rangle$ states is, where
$|+\rangle := H|0\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$ and $H := (X + Z)/\sqrt{2}$.
For concreteness, let $\mathcal{Z}_{k_{\max}}$ denote the set of states of the
form

\[ Z_k|+\rangle \tag{2} \]

for $2 \le k \le k_{\max}$.

In conjunction with the set $\mathcal{S}$ of stabilizer operations
[43], the set $\mathcal{Z}_{k_{\max}}$ can effect universal quantum computation,
even when restricted to $k_{\max} = 2$ [42]. We are
interested in the scenario in which a certain (overcomplete)
generating set for $\mathcal{S}$ is available, namely the set
consisting of the operations

\[ \{I, X, Y, Z, S, S^\dagger, H\} \cup \{|0\rangle, |+\rangle, M_Z, M_X\} \tag{3} \]

and

\[ \{\Lambda(X^{q_1} \otimes \cdots \otimes X^{q_m}) \mid q_i \in \{0, 1\}\}, \tag{4} \]

where $I$, $X$, $Y$, and $Z$ denote the Pauli operators, $M_X$
and $M_Z$ denote projective measurements in the X and Z
bases (but which may be "destructive" in that they do
not necessarily prepare X or Z eigenstates after the
measurement), and $\Lambda(X^q)$ denotes the one-control,
many-target controlled-NOT gate, where the number of targets
$m$ is some efficiently computable number. The unitary
gates in this generating set generate a subgroup of the
stabilizer operations known as the Clifford operations
[43], which are the set of operations that conjugate (tensor
products of) Pauli operators to (tensor products of)
Pauli operators.
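The defining property of Clifford operations, that they conjugate Pauli operators to Pauli operators, is easy to check numerically for single-qubit gates. The sketch below does so with plain 2x2 matrices; it assumes the convention $T = \mathrm{diag}(1, e^{i\pi/4})$, and the helper names (`conjugate_by`, `proportional`, `pauli_image`) are ours. It also shows that $T$ fails the test, since $T X T^\dagger$ is proportional to $SX$ rather than to a Pauli operator.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def conjugate_by(u, p):
    """Return U P U^dagger."""
    return matmul(matmul(u, p), dagger(u))


def proportional(a, b, tol=1e-12):
    """True if a = c * b for some unit-modulus phase c."""
    for i in range(2):
        for j in range(2):
            if abs(b[i][j]) > tol:  # first nonzero entry of b fixes the phase
                c = a[i][j] / b[i][j]
                return abs(abs(c) - 1) < tol and all(
                    abs(a[r][s] - c * b[r][s]) < tol
                    for r in range(2) for s in range(2))
    return False


SQRT_HALF = 1 / math.sqrt(2)
PAULIS = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}
X = PAULIS["X"]
S = [[1, 0], [0, 1j]]
H = [[SQRT_HALF, SQRT_HALF], [SQRT_HALF, -SQRT_HALF]]
T = [[1, 0], [0, cmath.exp(1j * math.pi / 4)]]


def pauli_image(u, p):
    """Name of the Pauli that U P U^dagger equals up to a phase, or None."""
    image = conjugate_by(u, p)
    for name, q in PAULIS.items():
        if proportional(image, q):
            return name
    return None


if __name__ == "__main__":
    print("S X S^dagger ->", pauli_image(S, X))
    print("H X H^dagger ->", pauli_image(H, X))
    print("T X T^dagger ->", pauli_image(T, X))
```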

These generators of $\mathcal{S}$ are "easy" to perform at the
logical level for the 4.8.8 2D color codes, motivating our
choice [38]. The set is also almost "easy" for Kitaev's
2D surface codes [24], except generating $S$ and $S^\dagger$
requires some constant startup costs that can be amortized
[44]. Amazingly, as alluded to in the introduction, all
elements from the set $\mathcal{S} \cup \mathcal{Z}_2$ (a universal set) are "easy"
to perform at the logical level for 3D color codes, but 3D
geometries are required to realize error correction with
these codes in a spatially local manner [38].

While errors in the "easy" operations can be suppressed
arbitrarily low using 2D topological codes, errors
in the operations in $\mathcal{Z}_{k_{\max}}$ cannot, making these operations
"hard" for these codes. The states in $\mathcal{Z}_{k_{\max}}$ can be
"injected" into such codes at the logical level [26], but doing
so also injects the errors in the state. In other words,
if the states in $\mathcal{Z}_{k_{\max}}$ have errors that are at most $\epsilon$ (as
measured by the trace distance [42]) as primitive instructions,
then the injected states will have errors that are
essentially the same when they become logical instructions,
assuming the injection process itself adds errors at
a low enough probability [84].

Motivated by these properties of 2D topological codes,
we will fix the control model for our study to be the
aforementioned generators of $\mathcal{S}$ and $\mathcal{Z}_{k_{\max}}$, and the error
model to be one in which the operations in $\mathcal{S}$ are error-free
but in which the $Z_k|+\rangle$ states in $\mathcal{Z}_{k_{\max}}$ each err by
at most $\epsilon$, as measured by the trace distance. Notice
that this control model makes no reference to codes or
fault-tolerant quantum computing protocols. We have
abstracted these away to focus on how to combine elementary
operations in $\mathcal{S}$ and $\mathcal{Z}_{k_{\max}}$ to achieve high-fidelity
Z rotations.

The question we address here is, 



How many elementary quantum operations
from $\mathcal{S}$ and $\mathcal{Z}_{k_{\max}}$ does it take to approximate
$Z_k$ with error at most $\epsilon' < \epsilon$ as a function of
$k_{\max}$, $k$, $\epsilon$, and $\epsilon'$?



The values of $k$ we are interested in could be smaller
than, equal to, or larger than $k_{\max}$. However, since $Z_0$
and $Z_1$ are both in the error-free set $\mathcal{S}$, we are only
interested in $k \ge 2$.



III. TRADITIONAL QUANTUM RISC 
ARCHITECTURE SOLUTION 

The standard method for refining the accuracy of a $Z_k$
rotation in this model is to synthesize it with what one
might term a quantum reduced instruction set computing,
or quantum RISC, architecture. The main idea is to
only synthesize $T := Z_2$ gates to high accuracy and then
rely on a quantum compiling algorithm to approximate
$Z_k$ arbitrarily well with a quantum circuit over $T$ gates
and adaptive stabilizer operations. The overall process
can be broken into the three steps of quantum compiling,
quantum gate teleportation, and magic-state distillation.

A. Protocol 

1. Quantum compiling

The first step, quantum compiling, generates a classical
description of an ideal quantum circuit that approximates
$Z_k$ to accuracy $\epsilon_{qc}$ using $O(\log^{\gamma}(1/\epsilon_{qc}))$ quantum
operations drawn from some instruction set, for
some small constant $\gamma$. While the error $\epsilon_{qc}$ can be
measured in multiple ways, a wise choice is to measure
$\epsilon_{qc}$ using the completely-bounded ("diamond") trace distance
[19, 45, 46] for reasons that we will explain later.
Examples of quantum compiling algorithms include the
Solovay-Kitaev algorithm [19, 39, 41, 42, 47-49], the Kitaev
phase kickback algorithm [50-52], programmed ancilla
algorithms [36, 53, 54], genetic algorithms [55], and
even Diophantine-equation algorithms [40]. When the
accuracy demand is not great, it is sometimes even plausible
to use algorithms which take exponential time to find
very short approximation sequences [56-59]. As noted in
the introduction, values for $\gamma$ range from 3.97 to 1.

Quantum compiling algorithms typically assume that
the elements of the instruction set are error-free. If one
implements the compiled circuit $Z_k^{(qc)}$ for $Z_k$ with operations
that may be in error, the resulting approximation
error will increase. To calculate the total error $\epsilon_k$ in this
flawed circuit $\tilde{Z}_k^{(qc)}$, we use the fact that the diamond
norm has many useful properties, including obeying the
triangle inequality, the chaining inequality, and unitary
invariance [60]. Using these, we can bound $\epsilon_k$ as

\[ \epsilon_k = d_\diamond(Z_k, \tilde{Z}_k^{(qc)}) \tag{5} \]
\[ \le d_\diamond(Z_k, Z_k^{(qc)}) + d_\diamond(Z_k^{(qc)}, \tilde{Z}_k^{(qc)}) \tag{6} \]
\[ \le \epsilon_{qc} + n_T \epsilon_T, \tag{7} \]

where the compiled circuit uses $n_T$ $T$ gates, each with
error at most $\epsilon_T$. To achieve the desired approximation
error of $\epsilon'$, it follows that sufficient conditions are

\[ \epsilon_{qc} \le C_{qc}\,\epsilon' \tag{8} \]
\[ \epsilon_T \le C_T\,\epsilon'/n_T, \tag{9} \]

for positive constants constrained to obey

\[ C_T + C_{qc} \le 1. \tag{10} \]
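The error budget in Eqs. (7)-(10) can be illustrated with a few lines of arithmetic. The sketch below splits a target error between compiling and the individual T gates and then re-evaluates the bound of Eq. (7); the function names and the example values (a target of $10^{-6}$ and 1000 T gates) are hypothetical and chosen only for illustration.

```python
def error_budget(eps_target, n_t, c_qc=0.5, c_t=0.5):
    """Split the target error according to Eqs. (8)-(10).

    Returns (eps_qc, eps_t): the compiling error allowed and the error
    allowed for each of the n_t T gates.
    """
    if c_qc <= 0 or c_t <= 0 or c_qc + c_t > 1:
        raise ValueError("need positive constants with c_t + c_qc <= 1")
    eps_qc = c_qc * eps_target
    eps_t = c_t * eps_target / n_t
    return eps_qc, eps_t


def total_error_bound(eps_qc, n_t, eps_t):
    """Right-hand side of Eq. (7): compiling error plus accumulated T-gate error."""
    return eps_qc + n_t * eps_t


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the formulas.
    eps_qc, eps_t = error_budget(eps_target=1e-6, n_t=1000)
    print("compiling error allowed:", eps_qc)
    print("error allowed per T gate:", eps_t)
    print("resulting bound:", total_error_bound(eps_qc, 1000, eps_t))
```

Note that the per-gate requirement tightens linearly with the number of T gates, so longer compiled sequences demand more distillation per gate.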

2. Quantum gate teleportation 

The second step, quantum gate teleportation, replaces
each $T$ gate in the quantum-compiled circuit by an adaptive
stabilizer circuit that teleports the $T$ gate from the
state $T|+\rangle$ or $T^\dagger|+\rangle$ to the desired qubit. An example of
a teleportation circuit using $T|+\rangle$ is depicted in Fig. 1.
The circuit is also correct if both $T$ operators are changed
to $T^\dagger$; it is even correct if only one of the $T$ operators is
changed to a $T^\dagger$ if the classical control is also changed to
act on a 0 instead of a 1.



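[Circuit diagram: inputs $|\psi\rangle$ and $T|+\rangle$; a controlled-NOT; an $M_Z$
measurement; a classically controlled $S$ gate; output $T|\psi\rangle$.]

FIG. 1: Circuit for teleporting the $T$ gate from the $T|+\rangle$
magic state.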
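Gate teleportation of this kind can be checked with a two-qubit state-vector calculation. The sketch below uses one standard arrangement, which is an assumption on our part and not necessarily the exact wiring of Fig. 1: the data qubit controls a CNOT onto the ancilla $T|+\rangle$, the ancilla is measured in the Z basis, and $S$ is applied to the data qubit when the outcome is 1. The convention $T = \mathrm{diag}(1, e^{i\pi/4})$ and all function names are ours.

```python
import cmath
import math

OMEGA = cmath.exp(1j * math.pi / 4)  # phase that T applies to |1>


def teleport_t(psi, outcome):
    """Teleport T onto the data state psi = (a, b) for a fixed M_Z outcome.

    Returns (probability of the outcome, normalized output data state).
    """
    a, b = psi
    magic = (1 / math.sqrt(2), OMEGA / math.sqrt(2))  # T|+>
    # Joint amplitudes amp[d][m] for data bit d and ancilla bit m.
    amp = [[a * magic[0], a * magic[1]],
           [b * magic[0], b * magic[1]]]
    # CNOT (data controls ancilla): flip the ancilla bit when the data bit is 1.
    amp[1][0], amp[1][1] = amp[1][1], amp[1][0]
    # Project the ancilla onto |outcome> and renormalize the data qubit.
    out = [amp[0][outcome], amp[1][outcome]]
    prob = abs(out[0]) ** 2 + abs(out[1]) ** 2
    norm = math.sqrt(prob)
    out = [out[0] / norm, out[1] / norm]
    if outcome == 1:
        out[1] *= 1j  # adaptive S correction
    return prob, out


def overlap(u, v):
    """Magnitude of the inner product; equals 1 when u = v up to a global phase."""
    return abs(u[0].conjugate() * v[0] + u[1].conjugate() * v[1])


if __name__ == "__main__":
    psi = (0.6, 0.8j)                  # arbitrary normalized test state
    target = (psi[0], OMEGA * psi[1])  # T|psi>
    for m in (0, 1):
        prob, out = teleport_t(psi, m)
        print(f"outcome {m}: probability {prob:.3f}, overlap with T|psi> {overlap(target, out):.6f}")
```

Both outcomes occur with probability 1/2, which is where the "half of the time" count for the adaptive correction comes from.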

It is worth remarking that the classically controlled 
S gate in this circuit is not a Clifford gate because of 
its adaptive nature; it is a common misconception that 
the adaptive S gate is a Clifford gate. While the fact 
that an adaptive S gate is non-Clifford follows mathe- 
matically from the fact that a bit flip on the classical 
control propagates to a non-Pauli operator, it is perhaps 
more amusing and convincing to consider the following 
consequence if the adaptive S gate had been a Clifford 
gate. Suppose our original circuit was a "programmable" 
circuit in which m of the qubits specified the quantum 
circuit to be implemented and n of the qubits described 
the initial state of the qubits it was supposed to act on. 
Without loss of generality, the state of these $m + n$ qubits
is describable by a classical $(m + n)$-bit string. Suppose
further that we replaced each $T$ gate in the circuit
with the circuit of Fig. 1 and that it had been a Clifford
circuit. Finally, suppose we executed this circuit
using some quantum device and read out the answer, after
which the quantum device that executed the circuit
was irrevocably destroyed. No problem! To run the circuit
on any other $(m + n)$-bit input string, we can just
propagate the Pauli X flips on the input through to a
set of Pauli X flips on the answer we read out, obtaining
the output on the new input. The propagation is guaranteed
to be efficiently implementable by any of a variety
of explicit algorithms derived from the Gottesman-Knill
theorem [43, 61, 62]. By this logic, there would never be
a need to run a quantum algorithm more than once on a
given instance size, since the output of the algorithm on
all other instances of the same size would be efficiently
computable from it with classical post-processing. Even
more amazingly, running the quantum algorithm once is
sufficient to determine the output of any other quantum
algorithm requiring the same number of bits for specification!
Quantum computers are great, but not that
great: the adaptive $S$ gate is not a Clifford operation.

Each teleportation circuit, not counting preparation of
its inputs, adds one $\Lambda(X)$ gate, one $I$ operation, one $M_Z$
measurement, and half of the time, one $S$ operation, for a
grand total of 3.5 operations on average and 4 operations
in the worst case. Since teleportation is only performed
once per $T$ gate, we use the average value of 3.5.



3. Magic-state distillation 

The third step, magic-state distillation, generates $T|+\rangle$
or $T^\dagger|+\rangle$ states with accuracy $\epsilon_T$ from a much larger
collection of states whose accuracy is only $\epsilon$. Reichardt
showed that this is possible using an ideal (error-free)
stabilizer circuit if and only if $\epsilon$ is less than the distillation
threshold $(2 - \sqrt{2})/4 \approx 0.146$ [63]. When operations
in the stabilizer circuit can err, the evaluation of
the threshold is more complex, as studied by
Jochym-O'Connor et al. [64].

There are multiple variations on how to implement 
magic-state distillation discussed in the literature [34, 
35, 65-68]; a popular one is the 15-to-1 Bravyi-Kitaev
protocol [34] based on the 15-qubit shortened quantum 
Reed-Muller code QRM(1,4). (See Appendix A for an 
explanation of this notation.) 

In their original paper, Bravyi and Kitaev proposed 
the following distillation protocol: 

1. Prepare the state $(T|+\rangle)^{\otimes 15}$.

2. Apply $A := T X T^\dagger \propto SX$ with probability 1/2 on
each qubit.

3. Measure the Z checks for QRM(1,4).

4. Identify qubits to flip to reset the Z-check syndrome
to 0.

5. Apply $A$ to the identified qubits.

6. Measure the X checks for QRM(1,4).

7. Declare failure if the X-check syndrome is not 0.
Otherwise, proceed to the next step.

8. Apply the coherent decoding circuit for QRM(1,4).
The output is $T^\dagger|+\rangle$ with a higher fidelity.

Because this algorithm uses many gates and qubits for 
syndrome measurement and correction, attempts have 
been made to simplify the circuit. For example, the X-check
measurements can be pushed through the decoding
circuit to become simpler individual $M_X$ measurements.
Some have attempted to push the Z-check measurements
and $A$ corrections through the circuit as well. Because
the $A$ operators are not Pauli operators, their propagation
through the Clifford decoding circuit is complicated.
The equivalent post-decoding-circuit Z-check measurements
become elaborate entangled measurements that
cannot be reconstructed by simply measuring each qubit
individually, despite suggestive circuits drawn in the
literature [27, 69].

A clever alternative by Raussendorf et al. obviates the 
need for measuring complicated operators by encoding 
half of a Bell state and processing it to distill the output 
on the other half [26]. Their protocol is as follows: 

1. Prepare the state $|\Phi^+\rangle := (|00\rangle + |11\rangle)/\sqrt{2}$.

2. Send half of $|\Phi^+\rangle$ through the coherent encoding
circuit for QRM(1,4).

3. Prepare the state $(T|+\rangle)^{\otimes 15}$.

4. Apply $A$ with probability 1/2 on each of these 15
qubits.

5. Teleport the $T$ gate from each of these "twirled"
$T|+\rangle$ states to a corresponding qubit on the encoded
half of $|\Phi^+\rangle$.

6. Measure $M_X$ on each of the 15 encoded qubits.

7. Infer the X-check values from the appropriate products
of these $M_X$ measurements. Declare failure if
the X-check syndrome is not 0. Otherwise, proceed
to the next step.

8. Infer the logical X value for QRM(1,4) by taking
the product of all of the $M_X$ values. If it is $-1$,
apply $Z$ to the other half of the original Bell state.
The unmeasured half of the original Bell state is
$T^\dagger|+\rangle$ with a higher fidelity.

The twirl operation is omitted by
Raussendorf et al. and by many others who have built
upon this protocol [70, 71]. However, twirling is essential
in the analysis by Bravyi and Kitaev in deriving Eq. (11)
below for the accuracy of the distilled output state:

\[ \epsilon_{\rm out}(\epsilon) = \frac{1 - (1 - 2\epsilon)^7 \left(30\epsilon + (1 - 2\epsilon)^8\right)}{2\left(1 + 15(1 - 2\epsilon)^8\right)} \tag{11} \]
\[ \approx 35\epsilon^3. \tag{12} \]
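As a numerical sanity check, the output-error formula of Eq. (11) can be compared with its leading-order form in Eq. (12). The sketch below evaluates the closed form directly; the function names are ours, and the direct evaluation is only suitable for moderately small $\epsilon$, because the numerator subtracts two nearly equal numbers when $\epsilon$ is tiny.

```python
def eps_out(eps):
    """Eq. (11): output error of one 15-to-1 distillation round.

    Direct evaluation; loses floating-point precision for very small eps
    because the numerator is a difference of nearly equal quantities.
    """
    x = 1.0 - 2.0 * eps
    numerator = 1.0 - x ** 7 * (30.0 * eps + x ** 8)
    denominator = 2.0 * (1.0 + 15.0 * x ** 8)
    return numerator / denominator


def eps_out_leading(eps):
    """Eq. (12): leading-order approximation 35 * eps^3."""
    return 35.0 * eps ** 3


if __name__ == "__main__":
    for eps in (1e-2, 1e-3, 1e-4):
        exact = eps_out(eps)
        approx = eps_out_leading(eps)
        print(f"eps = {eps:g}: Eq. (11) gives {exact:.4e}, 35 eps^3 gives {approx:.4e}")
```

Two limiting cases are a useful check on the formula: a perfect input ($\epsilon = 0$) stays perfect, and a maximally noisy input ($\epsilon = 1/2$) is returned unchanged.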



That said, Jochym-O'Connor et al. have discovered that 
magic-state distillation works at least as well, and maybe 
even better, when twirling is omitted in a distillation 
protocol based on the five-qubit code [64]. Inspired by 
this result, we too omit the twirling circuits from the 
protocol yet still use the Bravyi-Kitaev formula as a sort 
of loose guideline for how much the fidelity has improved, 
expecting that the fidelity increase may even be better. 

A quantum circuit that implements the Raussendorf 
et al. protocol is depicted in Fig. 2. 

The final $Z$ correction from step 8 that may need to
be applied is not depicted in Fig. 2 because it can be
incorporated into the subsequent teleportation circuit in
Fig. 1 instead at a lower gate cost. To do this, we just
replace the $I$ gate there with a $Z$ gate, as needed.
Alternatively, we could keep the $I$ gate and replace the $S$ gate
with $SZ = S^\dagger$. Either way, the number of gates in the
teleportation circuit is 3.5 on average and 4 in the worst
case.

FIG. 2: Distillation circuit for $T^\dagger|+\rangle$ states, adapted from
Ref. [26]. After the Bell state preparation, the gates perform
the coherent encoding circuit for the 15-qubit shortened quantum
Reed-Muller code. (See Appendix A for details.) The $T$
gates are performed by gate teleportation, using the circuit
from Fig. 1. This distillation circuit also distills $T|+\rangle$ states
on $T^\dagger|+\rangle$ inputs.

Let us now count the number of gates used by the 
Raussendorf et al. protocol. 

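To achieve $\epsilon_{\rm out} \le \epsilon_T$, one iterates this distillation
process $\ell$ times, where

\[ \ell(\epsilon_T, \epsilon) = \left\lceil \frac{\log \epsilon_T}{\log \epsilon_{\rm out}(\epsilon)} \right\rceil. \tag{13} \]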
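As an alternative to evaluating the closed form in Eq. (13), the number of rounds can be counted by applying the map of Eq. (11) repeatedly until the target is met. The sketch below does this; it is an illustration under our own assumptions, not the paper's procedure. To stay accurate at the very small errors reached after the first round, it expands the numerator of Eq. (11) as a polynomial in $x = 2\epsilon$, which is algebraically the same expression. The names `COEFFS`, `eps_out`, and `rounds_by_iteration` are ours.

```python
import math

# Numerator of Eq. (11) written as 1 - 15(1-x)^7 + 15(1-x)^8 - (1-x)^15
# and expanded in powers of x = 2*eps. The constant, linear, and quadratic
# terms cancel exactly, so summing the series avoids the cancellation that
# the closed form suffers in floating point when eps is tiny.
COEFFS = [
    (-1) ** j * (-15 * math.comb(7, j) + 15 * math.comb(8, j) - math.comb(15, j))
    for j in range(16)
]
COEFFS[0] += 1  # the leading "1" of the numerator


def eps_out(eps):
    """Output error of one distillation round, Eq. (11), via the expanded numerator."""
    x = 2.0 * eps
    numerator = sum(c * x ** j for j, c in enumerate(COEFFS) if c != 0)
    denominator = 2.0 * (1.0 + 15.0 * (1.0 - x) ** 8)
    return numerator / denominator


def rounds_by_iteration(eps_target, eps, max_rounds=50):
    """Count rounds needed to push the error from eps down to eps_target."""
    rounds = 0
    current = eps
    while current > eps_target:
        improved = eps_out(current)
        if improved >= current:
            raise ValueError("input error is too large for distillation to help")
        current = improved
        rounds += 1
        if rounds > max_rounds:
            raise RuntimeError("target not reached within max_rounds")
    return rounds


if __name__ == "__main__":
    for target in (1e-3, 1e-10, 1e-20):
        print(f"target {target:g}: {rounds_by_iteration(target, 1e-4)} round(s) from eps = 1e-4")
```

Because each round roughly cubes the error, very few rounds are needed even for extremely small targets.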



In addition to this, each of these rounds must itself
be repeated $t$ times because the X checks may
fail to give a trivial syndrome. Because low-error states
decode with higher probability, the expected number of
repetitions is small when $\epsilon$ is small. Specifically, the
expected number of times each round must be repeated is

\[ E[t(\epsilon)] = \frac{16}{1 + 15(1 - 2\epsilon)^8}. \tag{14} \]

We can compress identity gates out of each "wire" of
the distillation circuit by moving each preparation as far
to the right as possible and each $T$/$M_X$ pair as far to the
left as possible. While this by-hand compression of the
distillation circuit is not necessarily optimal (a different
choice of generators could be used, for example), it reduces
the number of stabilizer operations in the circuit
of Fig. 2 substantially. The number of gates occurring
before the $T$ gates is compressed to 35: 16 preparations,
5 $\Lambda(X^q)$ gates, 10 identity gates interior to the $\Lambda(X^q)$
gates, and 4 identity gates on the bottom qubit. Each
$T$/$M_X$ pair requires 5.5 operations on average (a count
which includes the preparation of the $T|+\rangle$ state in the
teleportation circuit of Fig. 1), adding 82.5 gates on average.
The bottom qubit must idle during this, incurring
1 identity gate during the final $M_X$ measurements and
3 identity gates during the $T$ teleportations before that,
as it is exceptionally likely (probability $1 - 2^{-15}$) that at
least one $T$ teleportation circuit will require a corrective
step. This leads to a grand total of approximately 121.5
expected gates on average.

We could reduce the number of qubits used from 31
to 17 by reusing the same qubit for each teleportation
circuit. As noted in Ref. [69], we could further reduce the
number of qubits by reusing the same qubit for each of
the first four control lines, bringing the qubit total down
to 14. However, each of these modifications introduces
additional qubit idling in the form of extra identity gates.
Since we are interested in reducing the number of gates,
we do not pursue these optimizations.

B. Resource analysis

As mentioned in the introduction, asymptotically the
total number of operations required to approximate a $Z_k$
gate with error $\epsilon'$ is $O(\log^{\alpha+\beta+\gamma}(1/\epsilon'))$, where the exponents
describe various overheads of the steps involved:
fault-tolerant stabilizer operations ($\alpha$), magic-state distillation
($\beta$), and quantum compiling ($\gamma$). While a good
starting point, asymptotic analysis like this fails to convey
the great number of elementary operations needed to
implement $Z_k$ gates, as it sweeps the (large!) constants
under the rug. The explicit expression for the expected
number of gates used by the RISC approach to approximate
$Z_k$ to error $\epsilon'$ using $T|+\rangle$ states whose error is $\epsilon$
is

\[ n_{\rm gates}^{\rm RISC}(Z_k, \epsilon', \epsilon) \le n_S^{(qc)}(Z_k, C_{qc}\epsilon') + n_T^{(qc)}(Z_k, C_{qc}\epsilon') \times r_T^{\rm RISC}\!\left(\frac{C_T\,\epsilon'}{n_T^{(qc)}(Z_k, C_{qc}\epsilon')}, \epsilon\right), \tag{15} \]

\[ r_T^{\rm RISC}(\epsilon_T, \epsilon) = (121.5)\,E[t(\epsilon)]\,\ell(\epsilon_T, \epsilon) + 3.75, \tag{16} \]

where $n_S^{(qc)}(Z_k, \epsilon)$ and $n_T^{(qc)}(Z_k, \epsilon)$ denote the number of
stabilizer operations and $T$ gates found by some quantum
compiling algorithm that approximates $Z_k$ to within $\epsilon$.

To better appreciate the compiling resources needed,
we consider the case when $C_{qc} = C_T = 1/2$, which balances
the quality demands of quantum compiling and
magic-state distillation. Let us also give the $T$ gate a
generous error rate of $\epsilon = 10^{-4}$, which is well below
the estimated threshold of $\approx 1\%$ for fault-tolerant quantum
computation with surface codes [26, 27]. To generate
values for $n_T^{(qc)}$ and $n_S^{(qc)}$, we appeal to the results
by Kliuchnikov et al. [58], who use a streamlined version
of the Dawson-Nielsen compiling algorithm iterated
$n_{\rm DN}$ times to achieve a given quantum compiling error.
They exhaustively tabulated gate counts on their website
http://qcirc.iqc.uwaterloo.ca/ for $k$ up to 29.
The numbers of operations $n_{\rm gates}^{\rm RISC}$ required to synthesize
$Z_k$ with these parameters to various approximation levels
and for small values of $k$ are listed in Table I.
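The pieces of this resource count can be assembled in a few lines. The sketch below tallies the 121.5-gate figure for one pass of the compressed distillation circuit, evaluates the expected number of repetitions in Eq. (14), and combines them as we read Eqs. (15) and (16). The function names are ours, the number of rounds is passed in as an input, and the compiled-circuit sizes in the usage example are hypothetical placeholders, not values from Ref. [58].

```python
# Gate tally for one pass of the compressed distillation circuit (Fig. 2).
GATES_BEFORE_T = 16 + 5 + 10 + 4   # preparations, Lambda(X^q) gates, interior and bottom-qubit identities
GATES_FOR_T_MX_PAIRS = 15 * 5.5    # 15 T/M_X pairs at 5.5 operations each on average
IDLE_GATES_BOTTOM_QUBIT = 1 + 3    # idling during M_X and during the T teleportations
GATES_PER_PASS = GATES_BEFORE_T + GATES_FOR_T_MX_PAIRS + IDLE_GATES_BOTTOM_QUBIT


def expected_repeats(eps):
    """Eq. (14): expected repetitions of a round until the X checks pass."""
    return 16.0 / (1.0 + 15.0 * (1.0 - 2.0 * eps) ** 8)


def gates_per_t(rounds, eps):
    """Eq. (16) as read here: expected gates spent per T gate for a given round count."""
    return GATES_PER_PASS * expected_repeats(eps) * rounds + 3.75


def risc_gate_bound(n_stabilizer, n_t, rounds, eps):
    """Eq. (15): stabilizer count plus T count times the per-T cost."""
    return n_stabilizer + n_t * gates_per_t(rounds, eps)


if __name__ == "__main__":
    # Hypothetical compiled-circuit sizes, chosen only to exercise the formula.
    n_s_example, n_t_example = 2000, 1500
    print("gates per distillation pass:", GATES_PER_PASS)
    print("expected repeats at eps = 1e-4:", expected_repeats(1e-4))
    print("RISC bound:", risc_gate_bound(n_s_example, n_t_example, rounds=2, eps=1e-4))
```

At $\epsilon = 10^{-4}$ the repetition factor is very close to 1, so the cost is dominated by the T count of the compiled circuit multiplied by the per-pass gate count.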



IV. QUANTUM CISC ARCHITECTURE 
SOLUTION 



After witnessing just how many gates are required to
implement $Z_k$ rotations using a quantum RISC architecture,
it's natural to ask if extending the instruction set to
a quantum complex instruction set computing architecture,
or quantum CISC architecture, could improve matters.
The point is that in any given quantum algorithm
instance, one isn't interested in applying arbitrary gates
but rather a specific set of gates, say $Z_k$ gates up to
some maximum value of $k$ in a quantum Fourier transform.
Because of this, it may make more sense to just
include those gates in the instruction set to begin with
rather than compiling them from a more limited instruction
set. Even if it is only feasible to include gates up
to some $Z_{k_{\max}}$, it is reasonable to expect that
the quantum compiling task that remains is much easier
than if one had restricted oneself to a smaller instruction
set.



A. Protocol 

Here we consider a programmed-ancilla CISC architecture,
in which we pre-compile $Z_k|+\rangle$ states offline that
can be used later to teleport the gate $Z_k$ on demand via
the circuit in Fig. 3. While the teleportation may require
a $Z_{k-1}$ gate for correction, iterating this process
recursively is a negative binomial process that converges
exponentially quickly: the expected number of Z rotations
for any $k$ is two, namely $Z_k$ on $|+\rangle$ and $Z_{k-1}$ after the
measurement. Because of this, to achieve error at most
$\epsilon'$ on the teleported $Z_k$ gate, the $Z_k|+\rangle$ state and the
$Z_{k-1}$ gate need to be performed with errors at most $C_1\epsilon'$
and $C_2\epsilon'$ respectively, where $C_1 + C_2 \le 1$.
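The recursion behind this convergence can be made concrete with a small calculation. The sketch below assumes the convention $Z_k = \mathrm{diag}(1, e^{i\pi/2^k})$, that each teleportation needs its $Z_{k-1}$ correction with probability 1/2, and that $Z_1 = S$ and $Z_0 = Z$ are free stabilizer operations; these assumptions and the function names are ours. It checks the identity $Z_{k-1} Z_k^\dagger = Z_k$ that makes $Z_{k-1}$ the right correction, and computes the expected number of magic states consumed.

```python
import cmath
import math


def z_phase(k):
    """Phase that Z_k applies to |1>, taking Z_k = diag(1, exp(i*pi/2^k))."""
    return cmath.exp(1j * math.pi / 2 ** k)


def correction_is_consistent(k, tol=1e-12):
    """Check Z_{k-1} Z_k^dagger = Z_k, the identity behind the Z_{k-1} fix-up."""
    return abs(z_phase(k - 1) * z_phase(k).conjugate() - z_phase(k)) < tol


def expected_magic_states(k):
    """Expected number of magic states consumed when teleporting Z_k.

    One Z_k|+> state is always used; with probability 1/2 a Z_{k-1}
    correction is needed, which is itself teleported in the same way.
    """
    if k <= 1:
        return 0.0  # Z_1 = S and Z_0 = Z are stabilizer operations
    return 1.0 + 0.5 * expected_magic_states(k - 1)


if __name__ == "__main__":
    for k in range(2, 9):
        print(f"k = {k}: expected magic states = {expected_magic_states(k):.4f}, "
              f"correction identity holds: {correction_is_consistent(k)}")
```

Under these assumptions the expectation equals $2 - 2^{2-k}$, which stays below two for every $k$ and approaches two as $k$ grows.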
Rows are listed in order of increasing $n_{\rm DN}$. Each group contains four columns, one per tabulated value of $k$.

    eps'          RISC (integer entries)                                          CISC n_gates
    3 x 10^-3     3,551          3,045          2,787          3,550              5       6       6       6
    9 x 10^-5     16,735         15,712         17,234         17,233             377     901     2,018   4,701
    1 x 10^-6     84,886         70,434         71,453         80,318             377     901     2,018   8,390
    6 x 10^-10    815,617        745,089        766,508        746,097            687     1,795   4,030   9,396
    7 x 10^-15    3,554,634      3,490,498      3,519,361      3,408,145          748     1,795   5,595   13,867
    7 x 10^-22    27,443,509     26,861,004     26,241,651     25,181,767         1,120   2,690   7,607   18,561
    9 x 10^-33    170,821,010    167,324,474    164,504,724    171,199,899        1,491   4,294   9,974   27,122
    1 x 10^-50    1,244,288,695  1,235,755,305  1,239,288,461  1,250,275,506      2,173   6,053   15,548  40,974

[The column headings giving the values of $k$ and the RISC entries printed in scientific notation are not recoverable.]

TABLE I: Expected number of gates required to approximate
$Z_k$ to precision $\epsilon'$ derived from the results of Kliuchnikov
et al. [58], which uses the Dawson-Nielsen RISC architecture
with Bravyi-Kitaev 15-to-1 magic-state distillation [34],
and the corresponding number of gates required by our CISC
architecture. For each number of iterations of the Dawson-Nielsen
algorithm $n_{\rm DN}$, the table lists the smallest output
error achievable $\epsilon'$, rounded up to its most significant figure,
assuming that quantum compiling and magic-state distillation
contribute equally to the total error and that the bare
error rate of $Z_k$ operations is $\epsilon = 10^{-4}$.



Our CISC approach is distinguished from previous
programmed-ancilla approaches [36, 53, 54] in that we
distill ancilla $Z_k|+\rangle$ states directly as instructions unto
themselves. This is a "top-down" approach in which
some of the time auxiliary $Z_{k-1}|+\rangle$ states are needed, and
even less of the time $Z_{k-2}|+\rangle$ states are needed, and so
on, until we get to the point that very rarely do we need
$T|+\rangle$ states. The previous approaches are "bottom-up"
in that they always compile from $T|+\rangle$ states upwards
until the $Z_k$ gate is performed; some of these schemes
(notably the recent one by Duclos-Cianci and Svore [54])
reduce resources by including intermediate targets, but
ultimately they all start from $T|+\rangle$ preparations at the
lowest level. By starting from the top, we avoid the need
to probe all the way to the bottom most of the time.
As we will see, this results in significant savings in the
number of operations needed to synthesize $Z_k$ gates.

FIG. 3: Magic-state circuit for teleporting the $Z_k$ gate.

The key to our construction is a family of shortened quantum
Reed-Muller codes that are defined in Appendix A. The property of these
codes that we harness here is that the $\overline{\mathrm{QRM}}(1, k+2)$
codes admit the logical $Z_k$ gate transversally, namely by applying
$Z_k^\dagger$ to each qubit independently. We know this because these
codes satisfy the conditions we derive in Appendix B. Because of this
transversality property, we can use the $\overline{\mathrm{QRM}}(1, k+2)$
code to distill $Z_k|+\rangle$ states using a circuit that is essentially
the same as the one used in the RISC architecture for distilling $T$
gates. Specifically, if we replace the encoding circuit for
$\overline{\mathrm{QRM}}(1, 4)$ with the encoding circuit for
$\overline{\mathrm{QRM}}(1, k+2)$ and replace each $T$ with $Z_k$ in the
distillation circuit depicted in Fig. 2, the circuit becomes a
distillation circuit for $Z_k|+\rangle$ states. As an example, we depict
the distillation circuit for $Z_3$ in Fig. 4; we derived the encoding
circuit for $\overline{\mathrm{QRM}}(1, 5)$ in the figure using the
methods outlined in Refs. [42, 72]. We defer a proof of why these codes
have the transversality property to Appendix B and instead focus here on
how the protocol works. We note, though, that our proof generalizes the
"tri-orthogonality" condition that Bravyi and Haah used to establish the
transversality of $T$ gates for their codes to a lemma in coding theory
proved by Ward that we call Ward's Divisibility Test [73, 74].

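Using the $\overline{\mathrm{QRM}}(1, k+2)$ code to distill $Z_k$ gates
yields the following distillation polynomial, which generalizes
Eq. (11):

$$\epsilon_{\mathrm{out}}(\epsilon) = \frac{1 - (1-2\epsilon)^{2^{k+1}-1}\left[2\epsilon\,(2^{k+2}-1) + (1-2\epsilon)^{2^{k+1}}\right]}{2\left[1 + (2^{k+2}-1)(1-2\epsilon)^{2^{k+1}}\right]} \qquad (17)$$

$$= \left(1 - 3\cdot 2^{k+1} + 2^{2k+3}\right)\left(\epsilon^3/3 + \epsilon^4 + O(\epsilon^5)\right). \qquad (18)$$
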

Approximate values for the distillation threshold for various values of
$k$ are listed in Table II; these are the same threshold values one
obtains if one uses the code to distill $Z_{k+1}$, but with only a
linear, not cubic, distillation polynomial, by generalizing the method
of Reichardt [63].
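
The distillation polynomial and its threshold can be checked numerically. The following is a minimal sketch, assuming Eq. (17) has the closed form $\epsilon_{\mathrm{out}} = \{1 - (1-2\epsilon)^{2^{k+1}-1}[2\epsilon(2^{k+2}-1) + (1-2\epsilon)^{2^{k+1}}]\} / \{2[1 + (2^{k+2}-1)(1-2\epsilon)^{2^{k+1}}]\}$ and taking the threshold to be the nontrivial fixed point $\epsilon_{\mathrm{out}}(\epsilon) = \epsilon$. The function names and the bisection bracket are illustrative choices, not part of the original analysis.

```python
def eps_out(eps: float, k: int) -> float:
    """Output error of one distillation round (assumed form of Eq. (17))."""
    n = 2 ** (k + 2) - 1      # code length
    half = 2 ** (k + 1)       # (n + 1) / 2
    u = 1.0 - 2.0 * eps
    numerator = 1.0 - u ** (half - 1) * (2.0 * eps * n + u ** half)
    denominator = 2.0 * (1.0 + n * u ** half)
    return numerator / denominator


def leading_coefficient(k: int) -> int:
    """Coefficient of eps**3 in Eq. (18); the bracket is always a multiple of 3."""
    return (1 - 3 * 2 ** (k + 1) + 2 ** (2 * k + 3)) // 3


def threshold(k: int, lo: float = 1e-6, hi: float = 0.4, steps: int = 200) -> float:
    """Bisection for the fixed point eps_out(eps) = eps inside (lo, hi).

    Assumes eps_out(eps) < eps just above lo and eps_out(eps) > eps at hi.
    """
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if eps_out(mid, k) < mid:
            lo = mid          # still below threshold: distillation helps
        else:
            hi = mid          # above threshold: distillation hurts
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for k in range(2, 11):
        print(k, leading_coefficient(k), f"{100 * threshold(k):.2f}%")
```

Under these assumptions the leading coefficients reproduce the second column of Table II, and the fixed points can be compared against its third column.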




FIG. 4: Distillation circuit for $Z_3|+\rangle = \sqrt{T}|+\rangle$
states; it is the 31-qubit shortened quantum Reed-Muller code's encoding
circuit applied to half of a Bell state, followed by the logical $Z_3$
gate and $M_X$ measurement of the qubits on this encoded half. The $Z_3$
gates are performed using the teleportation circuit depicted in Fig. 3.
This circuit also distills $Z_3|+\rangle$ states on $Z_3|+\rangle$
inputs.



Although the distillation threshold drops as $k$ increases, it is still
larger than or comparable to the threshold of $\approx 1\%$ for
fault-tolerant quantum computation with surface codes [25-27] for values
of $k$ less than or equal to 6, where it takes the value
$\epsilon_6^{\mathrm{th}} \approx 0.85\%$. This then sets a reasonable
upper limit on the size of the complex instruction set one should
consider for performing $Z_k$ gates in this way; going further would
place greater fidelity demands on the elementary operations than fault
tolerance does.

   k    $\epsilon_{\mathrm{out}}/\epsilon^3$    $\epsilon_k^{\mathrm{th}}$
   2           35          14.15%
   3          155           6.94%
   4          651           3.44%
   5        2 667           1.71%
   6       10 795           0.85%
   7       43 435           0.43%
   8      174 251           0.21%
   9      698 027           0.11%
  10    2 794 155           0.05%

TABLE II: Distillation polynomials (to most significant order) and
distillation thresholds for distilling $Z_k|+\rangle$ states.

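To achieve $\epsilon_{\mathrm{out}} < \epsilon'$, one must iterate the
distillation circuit

$$\ell(\epsilon', \epsilon) = \left\lceil \frac{\log \epsilon'}{\log \epsilon_{\mathrm{out}}(\epsilon)} \right\rceil \qquad (19)$$
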

times. The expected number of repetitions per iteration needed to
achieve distillation success, generalizing Eq. (14), is

$$E[r(\epsilon)] = \frac{2^{k+2}}{1 + (2^{k+2}-1)(1-2\epsilon)^{2^{k+1}}}. \qquad (20)$$
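
As a quick numerical illustration of Eq. (20), the sketch below evaluates the expected number of repetitions, assuming the equation reads $E[r(\epsilon)] = 2^{k+2} / [1 + (2^{k+2}-1)(1-2\epsilon)^{2^{k+1}}]$. Interpreting the reciprocal as a per-attempt acceptance probability is our reading of the formula, and the function name is hypothetical.

```python
def expected_repetitions(eps: float, k: int) -> float:
    """Expected attempts until one distillation round succeeds (assumed Eq. (20))."""
    n = 2 ** (k + 2) - 1
    # Reciprocal of Eq. (20), read here as the acceptance probability per attempt.
    accept = (1.0 + n * (1.0 - 2.0 * eps) ** (2 ** (k + 1))) / 2 ** (k + 2)
    return 1.0 / accept


if __name__ == "__main__":
    for k in range(2, 7):
        print(k, expected_repetitions(1e-4, k))
```

With noiseless inputs the expression equals exactly one attempt, and at the bare error rate $\epsilon = 10^{-4}$ used in Table I the overhead from repetitions stays within a few percent for $k \le 6$.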



Unlike in the RISC protocol, in which the corrective step in the
teleportation circuit added no error, in our protocol each teleportation
circuit may add error in its adaptive $Z_{k-1}$ gate. While we could
implement the $Z_{k-1}$ gate with low error using our protocol
recursively, this adds a great many resources for little benefit.
Instead, we simply use a "bare" $Z_{k-1}$ gate that relies on a
$Z_{k-1}|+\rangle$ state prepared with error $\epsilon$.



B. Resource analysis 

Asymptotically, our CISC protocol achieves a value of
$\rho = \rho_k := \log_3(2^{k+2} - 1)$ and $\gamma = 0$. The sum
$\rho + \gamma$ is less than the sum of the 15-to-1 Bravyi-Kitaev
magic-state distillation $\rho$ and the Dawson-Nielsen compiling
$\gamma$ for $k < 9$. However, since the distillation threshold drops
below 0.85% after $k = 6$, as argued earlier, it is probably wisest to
stop at $k = 6$. Compared to the best values we know for $\rho$
($\approx 1.58$ by Refs. [35, 36]) and $\gamma$ (1 by Ref. [40]), our
CISC protocol would appear to be superior only for $k \le 2$. However,
it is important to remember, as mentioned earlier, that arguing about
asymptotics in this way can be very misleading, as the constants
involved can be huge. For this reason, as before, we shift our
attention to a full accounting of the expected number of gates used.
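
The asymptotic comparison can be reproduced with a few lines of arithmetic. This sketch assumes $\rho_k = \log_3(2^{k+2} - 1)$ and compares it with the sum of the best known values quoted above ($\rho \approx 1.58$ and $\gamma = 1$); the constant name `BEST_KNOWN_SUM` is illustrative.

```python
import math

BEST_KNOWN_SUM = 1.58 + 1.0   # rho from Refs. [35, 36] plus gamma from Ref. [40]


def rho(k: int) -> float:
    """Asymptotic distillation exponent of the CISC protocol (gamma = 0)."""
    return math.log(2 ** (k + 2) - 1, 3)


if __name__ == "__main__":
    for k in range(2, 10):
        print(k, round(rho(k), 3), rho(k) < BEST_KNOWN_SUM)
```

Only $k = 2$ falls below the combined best known exponent, which is why the argument turns to explicit gate counts rather than asymptotics.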

The explicit expression for the expected number of gates used by our
CISC approach is just the sum of the number of gates used to distill a
$Z_k|+\rangle$ state and the number of gates used to perform the final
teleportation:

$$n_{\mathrm{gates}}(Z_k, \epsilon', \epsilon) = n_{\mathrm{distill}}(Z_k, \epsilon', \epsilon) + n_{\mathrm{teleport}}(Z_k, \epsilon', \epsilon). \qquad (21)$$

This is simpler than the corresponding RISC formula (15) in that
quantum-compiling factors do not appear, but more complex in that the
expression for each of the terms is more involved.

From Fig. 3, the final teleportation circuit, not counting preparation
of its inputs, adds one $\Lambda(X)$ gate, one $I$ operation, one $M_Z$
measurement, and, half of the time, one $Z_{k-1}$ operation that must be
applied with error at most $C_2\epsilon'$. Additionally, a $Z$ operation
may need to be applied to the input $Z_k|+\rangle$ state depending on
the outcomes of the $M_X$ measurements in the circuit that distilled
$Z_k|+\rangle$; the $Z$ correction will be needed half of the time. We
can propagate this through the circuit to the final step, where it acts
on the top qubit. By replacing the $I$ gate with the $Z$ gate as we did
in the case of $k = 2$, the expected number of gates that teleportation
adds is three plus the expected number required for the $Z_{k-1}$
correction:

$$n_{\mathrm{teleport}}(Z_k, \epsilon', \epsilon) = 3 + \frac{1}{2}\, n_{\mathrm{gates}}(Z_{k-1}, C_2\epsilon', \epsilon). \qquad (22)$$



In the worst case, the factor of 1/2 becomes 1, but we 
only consider the average case here for the very final tele- 
portation as we did in our RISC analysis. 

As with the RISC protocol, we compress the number of gates added by the
distillation circuit before counting them. Regardless of any special
structure to the $\overline{\mathrm{QRM}}(1, k+2)$ code, we know that we
can remove the identity gates on the first and last controlled
operations as well as the identity gates on the qubits involved in
control lines. The total number of gates in the compressed distillation
circuit before the $Z_k$ gates is therefore

$$\begin{aligned}
n_{\mathrm{pre}\,Z_k} ={}& 2^{k+2} && \text{preparations} \\
&+ (k+3) && \Lambda(X_q)\ \text{ops} \\
&+ 2^{k+1}(k+1) + 1 && I\ \text{during}\ \Lambda(X_q)\ \text{ops} \\
&- (k+1)^2 && I\ \text{on}\ \Lambda(X_q)\ \text{controls} && (23) \\
={}& 2^{k+1}(k+3) - k^2 - k + 3. && && (24)
\end{aligned}$$



It is possible to compress the distillation circuit further by taking
advantage of structure in the $\overline{\mathrm{QRM}}(1, k+2)$ codes
arising at individual values of $k$. For example, the approach just
described only compresses the distillation circuit for $Z_2$ down to 37
gates, whereas we were able to compress it down to 35 by examining
Fig. 2. The code's special structure allows one to move the leftmost
gate on qubit 5 one more step to the right and the rightmost gate on
qubit 10 one more step to the left. Similarly, one can shave 10 extra
gates off of the gate count for the $Z_3$ circuit, 32 gates off of the
$Z_4$ circuit, and 84 gates off of the $Z_5$ circuit. We stopped at
$Z_5$, but such by-hand gate reductions could be continued further.
Also, as we did for the $Z_2$ distillation circuit, we passed up
compressions in qubit number that increase the number of gates used.
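
The itemized count of Eq. (23) and its closed form in Eq. (24) can be cross-checked directly. The sketch below assumes the four contributions listed in Eq. (23) and applies the by-hand savings quoted in the text (2, 10, 32, and 84 gates for $k = 2, \ldots, 5$) to the pre-$Z_k$ count, which is our reading of where those savings enter; the function and dictionary names are hypothetical.

```python
# By-hand savings quoted in the text (37 -> 35 for k = 2, and so on).
HAND_SAVINGS = {2: 2, 3: 10, 4: 32, 5: 84}


def n_pre_itemized(k: int) -> int:
    """Gate count before the Z_k gates, summed term by term as in Eq. (23)."""
    preparations = 2 ** (k + 2)
    controlled_ops = k + 3
    idles_during_controlled_ops = 2 ** (k + 1) * (k + 1) + 1
    idles_on_controls = (k + 1) ** 2          # removed by compression
    return (preparations + controlled_ops
            + idles_during_controlled_ops - idles_on_controls)


def n_pre_closed_form(k: int) -> int:
    """Closed form of Eq. (24)."""
    return 2 ** (k + 1) * (k + 3) - k * k - k + 3


def n_pre_after_hand_tuning(k: int) -> int:
    """Apply the code-specific savings where the text reports them."""
    return n_pre_closed_form(k) - HAND_SAVINGS.get(k, 0)


if __name__ == "__main__":
    for k in range(2, 7):
        print(k, n_pre_itemized(k), n_pre_after_hand_tuning(k))
```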

After the $Z_k$ gates occur, the number of gates is just
$n_{\mathrm{post}\,Z_k} = 2^{k+2}$ for the $M_X$ measurements and the
concurrent $I$ gate.

The expression for the number of gates during the teleportation steps
for $Z_k$ is not the same as that in Eq. (22). Unlike in that setting,
we must explicitly count the preparation of the $Z_k|+\rangle$ states.
However, we need not account for a $Z$ correction coming from a
distillation circuit because the $Z_k|+\rangle$ states are not
distilled. The number of gates that a single $Z_k$ gate requires is
therefore given by a simple recurrence relation that accounts for the
four gates in Fig. 3 ($Z_k|+\rangle$, $\Lambda(X)$, $I$, and $M_Z$) and,
half of the time, a corrective $Z_{k-1}$ gate:

$$n_{\mathrm{tele}}^{(k)} = 4 + \frac{1}{2}\, n_{\mathrm{tele}}^{(k-1)}, \qquad (25)$$

$$n_{\mathrm{tele}}^{(1)} = 1. \qquad (26)$$

Solving this recurrence relation, we obtain

$$n_{\mathrm{tele}}^{(k)} = 8 - \frac{7}{2^{k-1}}, \qquad (27)$$

which asymptotes to 8 for large $k$. The total number of gates used by
teleportation is $2^{k+2} - 1$ times this value:

$$n_{\mathrm{tele}}(k) = (2^{k+2} - 1)\left(8 - 14 \cdot 2^{-k}\right). \qquad (28)$$
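
The recurrence and its solution can be verified with exact rational arithmetic. This is a minimal sketch, assuming the recurrence is $n^{(k)}_{\mathrm{tele}} = 4 + \tfrac{1}{2} n^{(k-1)}_{\mathrm{tele}}$ with $n^{(1)}_{\mathrm{tele}} = 1$ (four gates per teleportation plus a corrective gate half of the time), which is the form consistent with the closed form in Eq. (28). The function names are illustrative.

```python
from fractions import Fraction


def n_tele_single(k: int) -> Fraction:
    """Expected gates for one teleported Z_k, by iterating the recurrence."""
    value = Fraction(1)                 # base case: Z_1 costs one gate
    for _ in range(2, k + 1):
        value = 4 + value / 2           # four gates plus half a correction
    return value


def n_tele_closed_form(k: int) -> Fraction:
    """Closed-form solution, 8 - 14 * 2**(-k)."""
    return 8 - Fraction(14, 2 ** k)


def n_tele_total(k: int) -> Fraction:
    """Eq. (28): one teleportation per qubit of the length 2**(k+2) - 1 code."""
    return (2 ** (k + 2) - 1) * n_tele_closed_form(k)


if __name__ == "__main__":
    for k in range(1, 8):
        print(k, n_tele_single(k), n_tele_total(k))
```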

Finally, the last qubit in the distillation circuit (the one on the
other half of the Bell pair) must idle while all these teleportation
steps are happening. The first teleportation can be scheduled so that
the $Z_k|+\rangle$ preparations are coincident with the last
$\Lambda(X_q)$ operation, causing just two $I$ gates to occur on this
last qubit before the $Z_{k-1}$ gates are optionally called. The last
qubit must idle further if even a single one of the teleportations
requires a $Z_{k-1}$ corrective step. This additional idle occurs with
probability $1 - (1/2)^{2^{k+2}-1}$. We will consider a scenario in
which the $Z_{k-1}|+\rangle$ states are not prepared until we know they
are needed. The next teleportation iteration will incur three more $I$
steps on the last qubit plus the time it would take to perform a
$Z_{k-2}$ correction, if needed. Since the expected fraction of original
qubits requiring $Z_{k-1}$ gates is 1/2, the probability that at least
one $Z_{k-2}$ correction is needed is $1 - (1/2)^{(2^{k+2}-1)/2}$. Using
$\alpha$ as a shorthand to denote $2^{k+2} - 1$, we see that the
expression for the total number of identity gates incurred by the final
qubit is

$$n_{I,\mathrm{tele}}(k) = 2 + \left(1 - 2^{-\alpha/2^0}\right) 3 + \left(1 - 2^{-\alpha/2^1}\right) 3 + \cdots + \left(1 - 2^{-\alpha/2^k}\right) \qquad (29)$$

$$= 3(k-1) - O\!\left(2^{-\alpha/2^k}\right). \qquad (30)$$

Because the error term is very small, we will neglect it and use
$3(k-1)$ to represent the number of identity gates the last qubit
experiences.

Putting it all together, the number of gates in the circuit depicted in
Fig. 4 is

$$n_{\mathrm{distill}} = n_{\mathrm{pre}\,Z_k} + n_{\mathrm{tele}}(k) + n_{I,\mathrm{tele}}(k) + n_{\mathrm{post}\,Z_k} \qquad (31)$$

$$= 2^{k+1}(k+21) - k^2 + 2k - 64 + 14 \cdot 2^{-k}. \qquad (32)$$
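
The closed form in Eq. (32) follows from adding the four contributions derived above. The sketch below performs that sum with exact fractions, assuming $n_{\mathrm{pre}\,Z_k}$ from Eq. (24), $n_{\mathrm{tele}}(k)$ from Eq. (28), $3(k-1)$ idle gates on the last qubit, and $n_{\mathrm{post}\,Z_k} = 2^{k+2}$; the function names are hypothetical.

```python
from fractions import Fraction


def n_distill_from_parts(k: int) -> Fraction:
    """Sum the four pieces of the distillation circuit as in Eq. (31)."""
    n_pre = 2 ** (k + 1) * (k + 3) - k * k - k + 3              # Eq. (24)
    n_tele = (2 ** (k + 2) - 1) * (8 - Fraction(14, 2 ** k))    # Eq. (28)
    n_idle_last_qubit = 3 * (k - 1)                             # Eq. (30), error term dropped
    n_post = 2 ** (k + 2)                                       # measurements plus one idle
    return n_pre + n_tele + n_idle_last_qubit + n_post


def n_distill_closed_form(k: int) -> Fraction:
    """Closed form of Eq. (32)."""
    return (2 ** (k + 1) * (k + 21) - k * k + 2 * k - 64
            + Fraction(14, 2 ** k))


if __name__ == "__main__":
    for k in range(2, 7):
        print(k, float(n_distill_closed_form(k)))
```

The fractional part comes from counting the corrective gates in expectation rather than per run.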



The circuit in Fig. 4 may need to be repeated if the $M_X$ measurements
fail to distill a state. Moreover, the whole process may need to be
iterated to distill ultra-high-fidelity states. A subtlety is that
because we did not use distilled $Z_j|+\rangle$ states for $j < k$ in
the possible corrective steps required to teleport the $Z_k$ gate, the
error in each $Z_k$ gate is not $\epsilon$ but rather some multiple of
$\epsilon$ equal to the number of corrective steps required. We expect
that while all the $Z_k$ gates will have error at least $\epsilon$, half
of the remainder will have error $2\epsilon$ because they require an
additional teleportation, then a quarter of the remainder of those will
have error $3\epsilon$, an eighth of that remainder will have error
$4\epsilon$, and so forth until we reach the maximum number of errors,
which is $k - 1$, since the $S$ gate is error-free. The average error
per $Z_k$ gate is therefore

$$\bar{\epsilon} = \sum_{j=1}^{k-1} \frac{j\,\epsilon}{2^{j-1}} \qquad (33)$$

$$= 4\epsilon\left[1 - 2^{-k}(k+1)\right] \qquad (34)$$

$$< 4\epsilon. \qquad (35)$$
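
The closed form in Eq. (34) can be checked against an explicit sum. This sketch assumes the summand of Eq. (33) is $j\epsilon/2^{j-1}$ for $j = 1, \ldots, k-1$, which is the form consistent with the weights described in the text and with Eq. (34); it works in units of $\epsilon$, and the function names are illustrative.

```python
from fractions import Fraction


def average_error_factor(k: int) -> Fraction:
    """Average error per Z_k gate in units of epsilon (assumed Eq. (33))."""
    return sum((Fraction(j, 2 ** (j - 1)) for j in range(1, k)), Fraction(0))


def average_error_factor_closed_form(k: int) -> Fraction:
    """Closed form of Eq. (34) in units of epsilon."""
    return 4 * (1 - Fraction(k + 1, 2 ** k))


if __name__ == "__main__":
    for k in range(2, 8):
        print(k, average_error_factor(k))
```

For example, the factor is 1 for $k = 2$ and 2 for $k = 3$, and it approaches but never reaches 4.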



Since which qubits will suffer which multiple of $\epsilon$ in their
error probability is chosen uniformly at random, this process is
equivalent to the process in which each qubit independently undergoes an
error process in which its error rate is chosen to be $\epsilon$ plus an
additional $\epsilon$ with probability 1/2, an additional $2\epsilon$
with probability 1/4, an additional $3\epsilon$ with probability 1/8,
and so forth, which has an expected error rate of $\bar{\epsilon}$.
Because of the linearity of the trace distance, this means that the
effective noise process is the same as the original one, but with
$\epsilon$ replaced by $\bar{\epsilon}$. We can therefore make this
replacement in the expressions for the number of repetitions $r$ and
number of iterations $\ell$ required. The resulting total number of
gates needed to distill a $Z_k|+\rangle$ state in our CISC architecture
is

$$n_{\mathrm{distill}}(Z_k, \epsilon', \epsilon) = n_{\mathrm{distill}}\, E[r(\bar{\epsilon})]\, \ell(C_1\epsilon', \bar{\epsilon}). \qquad (36)$$



In order to compare the performance of our CISC architecture to the RISC
architecture, we chose target values for $\epsilon'$ achieved at the
various Dawson-Nielsen iteration levels $n_{\mathrm{DN}}$ listed in
Table I. We also chose $C_1 = C_2 = 1/2$ to balance the quality demands
of distillation and the final teleportation. Our RISC and CISC results
are presented side-by-side in Table I for ready comparison.



V. CONCLUSIONS 

Table I summarizes our main results. A graphical way 
of depicting the same data is shown in Fig. 5. 



[Figure 5 plot: $\log_{10}(\mathrm{Gates})$ versus
$\log_{10}(1/\epsilon')$; curves labeled RISC and CISC for
k = 3, 4, 5, 6.]

FIG. 5: Log of the number of gates required to synthesize the quantum
$Z(\pi/2^k)$ gate as a function of the log of the inverse of the desired
precision $\epsilon'$ for the RISC architecture described in the text
and our CISC architecture.

From the log-log plot in Fig. 5, two features indicating 
the advantages of our quantum CISC architecture over 
the quantum RISC architecture are immediately appar- 
ent. First, our quantum CISC architecture uses signifi- 
cantly fewer gates to synthesize Zk gates than the quan- 
tum RISC architecture does for k up to 6, a gap that 
grows as the precision demand grows. Second, up to a
precision demand of at least $10^{-50}$, the number of gates
required by our quantum CISC architecture scales much
more modestly with the precision demand than the quan- 
tum RISC architecture does. In combination, these two 
features demonstrate that our quantum CISC architec- 
ture is much less resource-intensive than the quantum 
RISC architecture is in this regime. 

The dramatic difference between the architectures at low precision
demand reflects the fact that when the hardware error rate is already
below this demand (i.e., when $\epsilon < \epsilon'$), the only gates
required by our quantum CISC architecture are those used to teleport the
gate $Z_k$ from the state $Z_k|+\rangle$ to the target state
$|\psi\rangle$. The RISC architecture doesn't include the $Z_k$ gate for
$k > 2$, so it must instead use a quantum compiling strategy to
synthesize $Z_k$ from $T|+\rangle$ states.

Our CISC architecture does have some limitations. To begin, as can be
seen in Fig. 5, the number of gates our CISC architecture uses increases
with $k$, even at fixed precision demand $\epsilon'$. Eventually, at any
fixed $\epsilon'$, there will be some $k$ for which the RISC
architecture uses fewer gates. However, as is apparent from Table II
though not from this plot, even before this happens the distillation
threshold for our CISC architecture drops below the accuracy threshold
for fault-tolerant quantum computation. Using our CISC architecture
beyond $k = 6$ would be foolhardy, as the distillation of encoded
instructions, and not the capacity of the underlying code, would then
set the experimental hardware demands at the physical level. For this
reason, we advocate using our CISC architecture up to $k = 6$, and then
relying on an external quantum compiling algorithm (but with a much
larger base instruction set than a quantum RISC architecture would
have!) to synthesize $Z_k$ rotations for larger $k$ values.

We focused on synthesizing Zk rotations for two rea- 
sons. First, numerous quantum algorithms rely on the 
quantum Fourier transform, which in turn is naturally de- 
composed into Clifford operations and Zk rotations. We 
thought it was important to focus on synthesizing trans- 
formations that arise in actual algorithms rather than 
operations that occur only in the abstract. Second, and 
more significantly, we were able to find a code family, the
shortened quantum Reed-Muller codes, that we could leverage
to create distillation protocols for $Z_k$ rotations. The key
enabling property these codes possess is code divisibility.
With this insight, we generalized the "tri-orthogonality"
condition of Bravyi and Haah [35] to a condition we call
Ward's Divisibility Test, which recognizes its analogous
role in classical coding theory [73]. We haven't sought
codes beyond the shortened quantum Reed-Muller codes
that pass Ward's Divisibility Test for admitting a $Z_k$-distillation
protocol. However, we present and prove the
correctness of this test in Appendix B in the hopes that 
others will find it helpful in the quest to improve quan- 
tum CISC architectures. 

One of the overall messages of our work is that it is not optimal to
first optimize the number of gates used to synthesize a universal
instruction set and then optimize the number of universal instructions
needed to synthesize a gate of interest, in this case, a $Z_k$ gate.
Instead, one can reap significant advantages by approaching this as a
single optimization problem. The best conjectured asymptotic scaling
when approached as two separate problems requires a number of gates
that scales as $O(\log^2(1/\epsilon'))$. By approaching this as a single
optimization problem, one may be able to achieve
$O(\log(1/\epsilon'))$ for the combined process.

The resource tradeoff space for implementing quantum operations with
finite discrete instruction sets is an area ripe for investigation.
Beyond just minimizing the number of instructions required to
approximate transformations of interest (our focus here), one might be
interested in minimizing other metrics, such as the number of qubits
used, the depth of the approximating quantum circuit, or the size of the
approximating quantum circuit (which is its depth times the number of
qubits). Depending on the task at hand, one instruction set may be more
suitable than another. Investigations along these lines help us better
understand the limits and capabilities of finite-instruction-set
quantum information processing.

Appendix A: Quantum Reed-Muller codes

One of the challenges in discussing quantum Reed-Muller codes is that
there is not a unique definition of what a quantum Reed-Muller code is
in the literature [34, 75-79]. Fortunately, there is at least a
well-established definition for what a classical Reed-Muller code is. We
state the definition for classical Reed-Muller codes below, confining
our attention to binary codes. We refer the reader to standard texts for
the definitions of supporting concepts such as Boolean monomials and
GF(2) [80].

Definition 1. The $r$th-order binary Reed-Muller code of length $2^m$,
denoted RM($r$, $m$), is the linear code over GF(2) whose generator
matrix is composed of row vectors corresponding to the Boolean monomials
over $\mathrm{GF}(2)^m$ of degree at most $r$.

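As an example, the generator matrix for the RM(1, 4) code is

$$G = \begin{pmatrix}
1&1&1&1&1&1&1&1&1&1&1&1&1&1&1&1 \\
1&1&1&1&1&1&1&1&0&0&0&0&0&0&0&0 \\
1&1&1&1&0&0&0&0&1&1&1&1&0&0&0&0 \\
1&1&0&0&1&1&0&0&1&1&0&0&1&1&0&0 \\
1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0
\end{pmatrix}. \qquad (A1)$$
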

From this definition, the codespace of binary Reed-Muller codes is just
the space of Boolean polynomials over $\mathrm{GF}(2)^m$ of degree at
most $r$. It is a minor combinatoric exercise to work out that the code
RM($r$, $m$) has rank $k = \sum_{i=0}^{r} \binom{m}{i}$ and code
distance $d = 2^{m-r}$. In standard coding theory notation, we say that
the code RM($r$, $m$) is an

$$[n, k, d] = \left[2^m,\ \sum_{i=0}^{r} \binom{m}{i},\ 2^{m-r}\right] \qquad (A2)$$

code.
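
Definition 1 translates directly into a short construction. The following is a minimal sketch that builds a generator matrix for RM($r$, $m$) from Boolean monomials and checks the parameters by brute force; the ordering of evaluation points is chosen to match the layout of Eq. (A1), and the function names are illustrative.

```python
from itertools import combinations, product


def rm_generator(r: int, m: int) -> list[list[int]]:
    """Rows are the monomials of degree <= r evaluated on all m-bit points."""
    # Points run from (1,...,1) down to (0,...,0), matching Eq. (A1).
    points = list(product((1, 0), repeat=m))
    rows = []
    for degree in range(r + 1):
        for subset in combinations(range(m), degree):
            # The empty subset is the constant monomial 1 (all-ones row).
            rows.append([int(all(p[i] for i in subset)) for p in points])
    return rows


def codewords(generator: list[list[int]]):
    """Yield every GF(2) linear combination of the generator rows."""
    n = len(generator[0])
    for coeffs in product((0, 1), repeat=len(generator)):
        word = [0] * n
        for c, row in zip(coeffs, generator):
            if c:
                word = [a ^ b for a, b in zip(word, row)]
        yield word


def min_distance(generator: list[list[int]]) -> int:
    """Smallest Hamming weight over the nonzero codewords (brute force)."""
    return min(sum(w) for w in codewords(generator) if any(w))


if __name__ == "__main__":
    for row in rm_generator(1, 4):
        print("".join(map(str, row)))
```

The brute-force distance check is exponential in the rank, so it is only practical for the small codes discussed here.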

It is straightforward to work out that the dual code to RM($r$, $m$) is
RM($m - r - 1$, $m$). We use this to define a quantum Reed-Muller code
as a CSS code composed of RM($r$, $m$) and its dual:

Definition 2. The $r$th-order quantum binary Reed-Muller code of length
$2^m$, denoted QRM($r$, $m$), is the CSS code [81, 82] whose defining
$X$ and $Z$ parity check matrices are the generator matrices for
RM($r$, $m$) and its dual RM($m - r - 1$, $m$), respectively.

Notice that in this definition, somewhat confusingly, 
the quantum parity-check matrices are formed from clas- 
sical generator matrices, not classical parity-check ma- 
trices. 

We are most interested in the shortened quantum binary Reed-Muller
codes, which we denote by $\overline{\mathrm{QRM}}(r, m)$. These codes
are formed by shortening each of the binary Reed-Muller codes from which
they are formed. The process of shortening first punctures a code by
removing a bit on which only one row of the generator matrix has support
and then expurgates it by removing the row in the generator matrix that
had support on that bit. For the Reed-Muller codes, this corresponds to
removing the first row and last column of the generator matrix when
presented in standard form, as in Eq. (A1). In essence, shortening a
Reed-Muller code restricts the space of Boolean polynomials defining the
code to those which have no constant term and which also satisfy
$p(0) = 0$. An equivalent way of characterizing the shortened
Reed-Muller code is as the even subcode of the punctured Reed-Muller
code. The parameters of the resulting quantum code are
$[[2^m - 1, 1]]$. Code parameters for small Reed-Muller codes, their
duals, and their shortened quantum construct are listed in Table III.
Notice that the length of the code $n$ does not uniquely specify which
shortened quantum Reed-Muller code one is referring to for $n > 15$.



Appendix B: Criteria for a code to admit transversal
$Z(\pi/2^k)$ rotations

The shortened quantum Reed-Muller codes
$\overline{\mathrm{QRM}}(1, k+2)$ admit a transversal implementation of
$Z_k$ by applying $Z_k^\dagger$ to each qubit in the code independently.
This result follows, e.g., from arguments made by Campbell et al. in
Ref. [79]. Another way to see this is to note that these codes obey
Theorem 1 below. We offer this alternative approach because it may be
generalizable in a way that others could use to find more efficient
codes that admit $Z_k$ transversally. It also relies on a lemma
(Lemma 1) that naturally generalizes an otherwise unusual criterion of
"tri-orthogonality" noted by Bravyi and Haah [35] for the
$\overline{\mathrm{QRM}}(1, 4)$ code. We believe that this lemma, which
we call Ward's Divisibility Test, makes better contact with the
classical coding theory literature.

  (r, m)   (m-r-1, m)   [n, k, d] primal   [n, k, d] dual   [[n, k]]
  (0,1)    (0,1)        [2,1,2]            [2,1,2]          -
  (0,2)    (1,2)        [4,1,4]            [4,3,2]          -
  (0,3)    (2,3)        [8,1,8]            [8,7,2]          -
  (1,3)    (1,3)        [8,4,4]            [8,4,4]          [7,1]
  (0,4)    (3,4)        [16,1,16]          [16,15,2]        -
  (1,4)    (2,4)        [16,5,8]           [16,11,4]        [15,1]
  (0,5)    (4,5)        [32,1,32]          [32,31,2]        -
  (1,5)    (3,5)        [32,6,16]          [32,26,4]        [31,1]
  (2,5)    (2,5)        [32,16,8]          [32,16,8]        [31,1]
  (0,6)    (5,6)        [64,1,64]          [64,63,2]        -
  (1,6)    (4,6)        [64,7,32]          [64,57,4]        [63,1]
  (2,6)    (3,6)        [64,22,16]         [64,42,8]        [63,1]

TABLE III: Parameters for (primal) Reed-Muller RM($r$, $m$) codes, their
duals RM($m - r - 1$, $m$), and their CSS-combined shortened quantum
versions $\overline{\mathrm{QRM}}(r, m)$ for small values. Shortened
RM(0, $m$) codes have no $X$ generator, so the resulting quantum codes
are just classical codes; they are referred to by "-" in the table.

Theorem 1. A quantum $[[n, 1]]$ CSS code [81, 82] with stabilizer
generators defined by the parity check matrix
$H = \mathrm{diag}(H_X, H_Z)$ via



)X 



H: 



S?:= 



Hf 



(Bl) 



where $H_X$ has rows $v_1, \ldots, v_{k+2}$, implements $(Z_k)^a$
transversally if

$$\mathrm{wt}(v_{\sigma(1)} \cdots v_{\sigma(j)}) \equiv 0 \bmod 2^{k+2-j} \qquad (B2)$$

for all $1 \le j \le k+2$ and all $\sigma \in S_j$, and

$$n \equiv a \bmod 2^{k+1}, \qquad (B3)$$

where '$\otimes$' denotes the tensor product, 'wt' denotes the Hamming
weight of a binary vector, $S_j$ denotes the permutation group on $j$
items, and '$v_1 \cdots v_j$' denotes the componentwise product of
$v_1, \ldots, v_j$.

When $a$ in this Theorem is odd, $\gcd(a, 2^{k+1}) = 1$, which means we
can use an algorithm like the extended Euclidean algorithm [83] to
efficiently find numbers $x$ and $y$ such that $ax + 2^{k+1}y = 1$.
Iterating $(Z_k)^a$ $x$ times results in a conditional phase of
$\pi(1 - 2^{k+1}y)/2^k$, which equals $\pi/2^k$ modulo $2\pi$;
in other words, $(Z_k)^{ax} = Z_k$ when $a$ is odd.
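
The search for $x$ can be sketched with a textbook extended Euclidean algorithm. The code below is illustrative: it assumes $a$ is taken as $n \bmod 2^{k+1}$ per Eq. (B3), and the helper name `repetitions_for_logical_Zk` is hypothetical.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def repetitions_for_logical_Zk(n: int, k: int) -> tuple[int, int]:
    """Return (a, x): a = n mod 2**(k+1) and x with a*x = 1 mod 2**(k+1)."""
    modulus = 2 ** (k + 1)
    a = n % modulus
    g, x, _ = extended_gcd(a, modulus)
    if g != 1:
        raise ValueError("a must be odd for an inverse to exist")
    return a, x % modulus


if __name__ == "__main__":
    for k in range(2, 7):
        n = 2 ** (k + 2) - 1          # length of the shortened code
        print(k, repetitions_for_logical_Zk(n, k))
```

For the code lengths $n = 2^{k+2} - 1$ used here, $a \equiv -1$ and hence $x \equiv -1 \pmod{2^{k+1}}$, so the required transversal gate is the inverse of $Z_k$ on each qubit.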

Condition (B2) generalizes the tri-orthogonality condition of Bravyi and
Haah [35] into a kind of $(k+1)$-orthogonality condition. More
fundamentally, we want the classical linear code generated by $H_X$ to
be a code in which every codeword has a Hamming weight divisible by
$2^{k+1}$. Ward studied such divisible codes in depth, and one of his
results is that $2^{k+1}$-divisibility is testable by the condition of
Eq. (B2) [73]. More explicitly, Ward's Divisibility Test is captured by
Lemma 1 below. (Ward's result is actually more general; we use a version
specialized to the binary case, as noted by Proposition 4.2 in
Ref. [74].)

Lemma 1 (Ward's Divisibility Test [73]). The binary linear code with
generator matrix $H_X$ whose row vectors are $v_1, \ldots, v_{k+2}$ is
divisible by $2^{k+1}$ if and only if

$$2^{k+2-j} \mid \mathrm{wt}(v_{\sigma(1)} \cdots v_{\sigma(j)}) \qquad (B4)$$

for all $1 \le j \le k+1$ and all permutations $\sigma \in S_j$.

While Ward's Divisibility Test has the advantage of being an explicit
algorithm for testing divisibility, it is not particularly efficient, as
it takes a time that is exponential in $k$ to execute. For codes with a
high degree of structure, such as the shortened RM(1, $k+2$)
Reed-Muller codes, demonstrating $2^{k+1}$-divisibility is much simpler,
as noted in Ref. [74].
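
For small $k$ the test can simply be run by brute force. The sketch below builds the generator rows of the shortened RM(1, $k+2$) code (the degree-one monomials evaluated on all nonzero points), applies the condition of Eq. (B4) to every choice of $j$ distinct rows, and compares the outcome with a direct check of all codeword weights. Reading "all permutations" as "all selections of $j$ distinct rows" is our interpretation, and the function names are hypothetical.

```python
from itertools import combinations, product


def shortened_rm1_generator(m: int) -> list[list[int]]:
    """Rows x_1..x_m on the nonzero points: no constant row, no zero point."""
    points = [p for p in product((1, 0), repeat=m) if any(p)]
    return [[p[i] for p in points] for i in range(m)]


def passes_ward_test(rows: list[list[int]], k: int) -> bool:
    """Condition (B4): 2**(k+2-j) divides wt of every product of j rows."""
    for j in range(1, k + 2):                      # 1 <= j <= k + 1
        for subset in combinations(rows, j):
            weight = sum(all(bits) for bits in zip(*subset))
            if weight % 2 ** (k + 2 - j) != 0:
                return False
    return True


def is_divisible(rows: list[list[int]], k: int) -> bool:
    """Direct check that every codeword weight is a multiple of 2**(k+1)."""
    n = len(rows[0])
    for coeffs in product((0, 1), repeat=len(rows)):
        word = [0] * n
        for c, row in zip(coeffs, rows):
            if c:
                word = [a ^ b for a, b in zip(word, row)]
        if sum(word) % 2 ** (k + 1) != 0:
            return False
    return True


if __name__ == "__main__":
    for k in range(2, 6):
        rows = shortened_rm1_generator(k + 2)
        print(k, len(rows[0]), passes_ward_test(rows, k), is_divisible(rows, k))
```

Both checks are exponential in $k$, in line with the remark above, so this is a sanity check for the small codes considered here rather than a practical test.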

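Proof of Theorem 1. By Ward's Divisibility Test, every vector $v$ in the
rowspan $C$ of $H_X$ has a Hamming weight divisible by $2^{k+1}$. Since
the logical $|\bar{0}\rangle$ for the code is
$|\bar{0}\rangle := \sum_{v \in C} |v\rangle$ (ignoring normalization),
the action of transversal $Z_k$ on $|\bar{0}\rangle$ is

$$\begin{aligned}
Z_k^{\otimes n} |\bar{0}\rangle &= \sum_{v \in C} Z_k^{\otimes n} |v\rangle && (B5) \\
&= \sum_{v \in C} \left(e^{i\pi/2^k}\right)^{|v|} |v\rangle && (B6) \\
&= \sum_{v \in C} |v\rangle && (B7) \\
&= |\bar{0}\rangle. && (B8)
\end{aligned}$$

Similarly, using Eq. (B3), the action of transversal $Z_k$ on
(unnormalized) $|\bar{1}\rangle = \bar{X}|\bar{0}\rangle$ is

$$\begin{aligned}
Z_k^{\otimes n} |\bar{1}\rangle &= Z_k^{\otimes n} \bar{X} |\bar{0}\rangle && (B9) \\
&= \sum_{v \in C} Z_k^{\otimes n} \bar{X} |v\rangle && (B10) \\
&= \sum_{v \in C} Z_k^{\otimes n} |v \oplus \mathbf{1}\rangle && (B11) \\
&= \sum_{v \in C} \left(e^{i\pi/2^k}\right)^{n - |v|} |v \oplus \mathbf{1}\rangle && (B12) \\
&= \sum_{v \in C} e^{i\pi a/2^k} |v \oplus \mathbf{1}\rangle && (B13) \\
&= e^{i\pi a/2^k} |\bar{1}\rangle, && (B14)
\end{aligned}$$

where $\mathbf{1} := (1, \ldots, 1)$ denotes the all-ones vector, whose
appearance comes from the fact that, up to local qubit basis changes,
$\bar{X} = X^{\otimes n}$ for all CSS codes. These actions of
$Z_k^{\otimes n}$ replicate $(Z_k)^a$ on the logical basis, and
therefore $Z_k$ implements $(Z_k)^a$ transversally. □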